Authors: XU L. | CHOW M. | TAYLOR L.S.

Issue information:
  • Year: 2006
  • Volume: -
  • Issue: -
  • Pages: 0-0
Engagement:
  • Citations: 1
  • Views: 160
  • Downloads: 0
  • References: 0

Authors: CHAWLA N. | JAPKOWICZ N. | KOLCZ A.

Journal: ACM SIGKDD EXPLORATIONS

Issue information:
  • Year: 2004
  • Volume: 6
  • Issue: 1
  • Pages: 1-6
Engagement:
  • Citations: 1
  • Views: 282
  • Downloads: 0
  • References: 0

Issue information:
  • Year: 2012
  • Volume: 3
  • Issue: 2
  • Pages: 1-9
Engagement:
  • Citations: 0
  • Views: 360
  • Downloads: 0
Abstract:

A fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, so that it performs poorly on the minority class. In many cases, however, the minority classes are more important than the majority ones. In this paper, we extend the basic FRBCS to reduce the side effects of imbalanced data by employing data-mining criteria such as confidence and support. These measures are computed from information derived from the data in the subspace of each fuzzy rule. The experimental results show that the proposed method can improve classification accuracy on benchmark data sets.
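The abstract does not say how confidence and support are computed. As a minimal sketch, assuming the definitions commonly used for fuzzy rules (membership-weighted counts), the two measures for a single rule could look like this. The function name, the toy membership values, and the class names are illustrative assumptions, not the authors' implementation.

```python
def rule_confidence_support(memberships, labels, target_class):
    """Return (confidence, support) of a fuzzy rule 'IF x is A THEN class = target_class'.

    memberships[i] is the compatibility degree of sample i with the rule
    antecedent A (a value in [0, 1]); labels[i] is the class of sample i.
    Assumed definitions (not taken from the paper):
      confidence = membership mass of target-class samples / total membership mass
      support    = membership mass of target-class samples / number of samples
    """
    if len(memberships) != len(labels):
        raise ValueError("memberships and labels must have the same length")
    total = sum(memberships)
    covered = sum(m for m, y in zip(memberships, labels) if y == target_class)
    confidence = covered / total if total > 0 else 0.0
    support = covered / len(labels) if labels else 0.0
    return confidence, support


# Toy data (hypothetical): five samples, the minority class "pos" has two.
mu = [0.9, 0.8, 0.1, 0.2, 0.0]
y = ["pos", "pos", "neg", "neg", "neg"]
conf, supp = rule_confidence_support(mu, y, "pos")
print(round(conf, 2), round(supp, 2))
```

Because both measures are weighted by membership in the rule's own subspace, a rule can be strongly associated with the minority class even when that class is rare in the data set as a whole.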

References: 7
Issue information:
  • Year: 1402 (Solar Hijri)
  • Volume: 21
  • Issue: 4
  • Pages: 273-283
Engagement:
  • Citations: 0
  • Views: 150
  • Downloads: 29
Abstract:

In the era of big data, automated analysis techniques such as data mining are widely used for decision making and have proven highly effective. One such data mining technique is classification, a common method for decision making and prediction. Classification algorithms typically perform well on balanced data sets. However, one problem they face is correctly predicting the labels of new samples after learning from imbalanced data sets. In such data sets, the uneven distribution of samples across classes causes the classifier to neglect samples of the class with fewer instances during learning, even though this class is the more important one in some prediction problems. To address this problem, this paper presents an efficient method for balancing imbalanced data sets which, by equalizing the number of samples across classes, improves the machine learning algorithm's ability to correctly predict the class labels of new samples. According to the evaluations carried out, the proposed method outperforms other methods on two criteria commonly used to evaluate classification on imbalanced data sets, namely "balanced accuracy" and "specificity".
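The two evaluation criteria named in the abstract have standard definitions for the binary case: specificity is the true-negative rate, and balanced accuracy is the mean of sensitivity and specificity. The sketch below computes both; the toy labels are hypothetical and only illustrate why plain accuracy can look better than it should on imbalanced data.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count (tp, fp, tn, fn), treating `positive` as the class of interest."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def specificity(y_true, y_pred, positive):
    """True-negative rate: tn / (tn + fp)."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred, positive)
    return tn / (tn + fp) if (tn + fp) else 0.0


def sensitivity(y_true, y_pred, positive):
    """True-positive rate (recall): tp / (tp + fn)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def balanced_accuracy(y_true, y_pred, positive):
    """Mean of sensitivity and specificity (binary case)."""
    return (sensitivity(y_true, y_pred, positive)
            + specificity(y_true, y_pred, positive)) / 2


# Toy example (hypothetical): 2 minority samples (1) and 8 majority samples (0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
spec = specificity(y_true, y_pred, positive=1)
bal_acc = balanced_accuracy(y_true, y_pred, positive=1)
print(plain_accuracy, spec, bal_acc)
```

Here plain accuracy is 0.8 while balanced accuracy is 0.6875, because half of the minority samples are misclassified.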

References: 0
Issue information:
  • Year: 2022
  • Volume: 16
  • Issue: 4
  • Pages: 123-129
Engagement:
  • Citations: 0
  • Views: 45
  • Downloads: 0
Abstract:

Through the use of malware, particularly malicious JavaScript, cybercriminals have made online applications one of their main targets for impersonation. Detecting such dangerous code in real time is therefore crucial to preventing harmful actions. By categorizing the salient characteristics of malicious code, this study proposes an effective technique for identifying previously unknown malicious JavaScript, employing an interceptor on the client side. A feature subset was generated using the wrapper approach for dimensionality reduction. In this paper, we propose an approach for handling the malware detection task in imbalanced data situations. Our approach uses two main techniques for imbalanced data, namely the Synthetic Minority Over-sampling Technique (SMOTE) and Tomek links, to augment the data, and then applies a Deep Neural Network (DNN) to classify the scripts. The experiments demonstrate the effectiveness of our approach, which achieves an accuracy of 94.00%.
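A Tomek link is a pair of samples from different classes that are each other's nearest neighbor. The sketch below finds such pairs and removes the majority-class member of each pair. It is a plain-Python illustration under stated assumptions (Euclidean distance, removal of the majority member only, hypothetical toy points), not the authors' pipeline; the SMOTE step and the DNN are omitted.

```python
import math


def nearest_neighbor(i, points):
    """Index of the point closest to points[i], excluding i itself."""
    best, best_d = None, math.inf
    for j, q in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], q)
        if d < best_d:
            best, best_d = j, d
    return best


def tomek_links(points, labels):
    """Pairs (i, j), i < j, that are mutual nearest neighbors with different labels."""
    nn = [nearest_neighbor(i, points) for i in range(len(points))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and labels[i] != labels[j]]


def remove_majority_in_links(points, labels, majority):
    """Drop the majority-class member of every Tomek link (one common variant)."""
    drop = {k for pair in tomek_links(points, labels)
            for k in pair if labels[k] == majority}
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Toy data (hypothetical): points 0 and 1 sit next to each other but differ in class.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.2, 1.0), (5.0, 5.0), (5.1, 5.0)]
labels = ["maj", "min", "maj", "maj", "min", "min"]
links = tomek_links(points, labels)
clean_points, clean_labels = remove_majority_in_links(points, labels, "maj")
print(links, clean_labels)
```

Removing these borderline pairs after oversampling is usually intended to clean up the class boundary before the classifier is trained.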

References: 0
Authors: VISA SOFA | RALESCU ANCA

Issue information:
  • Year: 2005
  • Volume: -
  • Issue: 6
  • Pages: 67-73
Engagement:
  • Citations: 1
  • Views: 224
  • Downloads: 0
  • References: 0

Authors: Sardari Sahar | EFTEKHARI MAHDI

Issue information:
  • Year: 2016
  • Volume: 6
Engagement:
  • Views: 146
  • Downloads: 0
Abstract:

Recently, new fuzzy decision tree (FDT) approaches have been developed for classification tasks. In this paper, one of these FDTs is adapted for imbalanced classification tasks. First, our proposed method uses the k-means algorithm to group the majority class samples into several clusters. Then, each cluster is labeled as a new class, and thereby the binary imbalanced classification problem is converted into a multi-class classification problem. Finally, the FDT algorithm is employed to classify the new data set. The obtained results show that our proposed method outperforms almost all the other fuzzy rule-based approaches on highly imbalanced data sets.
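The relabeling step described above can be sketched as follows: cluster only the majority samples with k-means and give each cluster its own class label, leaving the minority samples untouched. This is an illustrative sketch; the deterministic initialization, the label naming scheme, and the toy data are assumptions, and the fuzzy decision tree itself is not shown.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; centroids start at the first k points (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        # Move each centroid to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assignment


def relabel_majority(points, labels, majority, k):
    """Replace the majority label with one label per k-means cluster."""
    idx = [i for i, y in enumerate(labels) if y == majority]
    clusters = kmeans([points[i] for i in idx], k)
    new_labels = list(labels)
    for i, c in zip(idx, clusters):
        new_labels[i] = f"{majority}_{c}"
    return new_labels


# Toy data (hypothetical): four majority samples in two groups, one minority sample.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (5.0, 5.0)]
labels = ["neg", "neg", "neg", "neg", "pos"]
new_labels = relabel_majority(points, labels, majority="neg", k=2)
print(new_labels)
```

After relabeling, each new majority sub-class is closer in size to the minority class, which is the effect the abstract relies on when it turns the binary problem into a multi-class one.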

Authors: BaniMustafa Ahmed

Issue information:
  • Year: 2019
  • Volume: 11
  • Issue: 3
  • Pages: 79-89
Engagement:
  • Citations: 0
  • Views: 213
  • Downloads: 0
Abstract:

This paper presents a data mining application in metabolomics. It aims at building an enhanced machine learning classifier that can be used for diagnosing cachexia syndrome and identifying the biomarkers involved. To achieve this goal, a data-driven analysis is carried out using a public dataset consisting of 1H-NMR metabolite profiles. This dataset suffers from the problem of imbalanced classes, which is known to degrade the performance of classifiers and to affect their validity and generalizability. The classification models in this study were built using five machine learning algorithms: PLS-DA, MLP, SVM, C4.5 and ID3. The models were built after carrying out a number of intensive data preprocessing procedures to tackle the problem of imbalanced classes and improve the performance of the constructed classifiers. These procedures involve data transformation, normalization, standardization, re-sampling, and data reduction using a number of variable importance scorers. The best performance was achieved by an MLP model that was trained and tested using five-fold cross-validation on datasets that were re-sampled using the SMOTE method and then reduced using the SVM variable importance scorer. This model was successful in classifying samples with excellent accuracy and also in identifying the potential disease biomarkers. The results confirm the validity of metabolomics data mining for the diagnosis of cachexia. They also emphasize the importance of data preprocessing procedures such as sampling and data reduction for improving data mining results, particularly when the data suffers from the problem of imbalanced classes.
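The core idea of SMOTE is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbors. The sketch below illustrates that idea in plain Python. The function name, the parameter defaults, and the toy points are assumptions for illustration; this is not the paper's preprocessing code, and a maintained library implementation would normally be used in practice.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest other minority samples.
        neighbors = sorted((j for j in range(len(minority)) if j != i),
                           key=lambda j: math.dist(base, minority[j]))[:k]
        other = minority[rng.choice(neighbors)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy minority class (hypothetical): the four corners of a unit square.
minority = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]
new_points = smote_like(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already occupied by the minority class.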

References: 0
Issue information:
  • Year: 2015
  • Volume: 3
  • Issue: 1 (9)
  • Pages: 22-28
Engagement:
  • Citations: 0
  • Views: 644
  • Downloads: 0
Abstract:

Credit scoring is an important topic, and banks collect a variety of data from their loan applicants to make appropriate and correct decisions. Rule bases receive particular attention in credit decision making because of their ability to explicitly distinguish between good and bad applicants. Credit scoring datasets are usually imbalanced, mainly because the number of good applicants in a loan portfolio is usually much higher than the number of loans that default. This paper uses rule bases previously applied in credit scoring, including RIPPER, OneR, Decision Table, PART and C4.5, to study the reliability and results of sampling on its own dataset. A real database from an Iranian export development bank is used, and imbalanced data issues are investigated by randomly oversampling the minority class of defaulters and by three-times undersampling of the majority class of non-defaulters. The performance criteria chosen to measure the reliability of the rule extractors are the area under the receiver operating characteristic curve (AUC), accuracy, and the number of rules. Friedman's statistic is used to test for significant differences between techniques and datasets. The results show that PART performs better and that the mix of good and bad samples in the data affects its results less.
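Random oversampling and random undersampling, the two re-sampling strategies named in the abstract, can be sketched in a few lines. The version below balances classes fully in each direction, which is an assumption: the abstract does not give the sampling ratios used, and the row values and class names are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class so all classes match the smallest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for r in rng.sample(pool, target):
            out_rows.append(r)
            out_labels.append(cls)
    return out_rows, out_labels


# Toy portfolio (hypothetical): 8 non-defaulters ("good") and 2 defaulters ("bad").
rows = list(range(10))
labels = ["good"] * 8 + ["bad"] * 2
over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
print(Counter(over_labels), Counter(under_labels))
```

Oversampling keeps every original row but repeats defaulters, while undersampling discards non-defaulters, so the two strategies trade duplicated information against lost information.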

References: 4
Issue information:
  • Year: 2015
  • Volume: 1
Engagement:
  • Views: 127
  • Downloads: 0
Abstract:

In this paper, fuzzy c-means clustering and rotation forest (RF) are combined to construct a high-performance classifier for imbalanced data classification. Data samples are clustered via fuzzy clustering, and the fuzzy membership matrix is then appended to the data samples, so that the cluster memberships of the samples serve as new features added to the original ones. After that, RF is used for classification, where the new features as well as the original ones are taken into account in the feature subspacing phase. The proposed algorithm uses the SMOTE oversampling algorithm to balance the data samples. The obtained results confirm that our proposed method outperforms other well-known bagging algorithms. ...
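The feature-augmentation step can be illustrated with the standard fuzzy c-means membership formula, u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)), where d_j is the distance from a sample to centroid j and m is the fuzzifier. The sketch below assumes the centroids are already known and uses m = 2; the centroid values and function names are hypothetical, and the clustering loop, SMOTE, and rotation forest are not shown.

```python
import math


def fcm_memberships(point, centroids, m=2.0):
    """Fuzzy c-means membership of one point in each cluster (values sum to 1)."""
    dists = [math.dist(point, c) for c in centroids]
    if any(d == 0.0 for d in dists):
        # The point coincides with a centroid: share full membership among those.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** power for d_k in dists) for d_j in dists]


def augment(points, centroids, m=2.0):
    """Append the cluster memberships of each point to its original features."""
    return [tuple(p) + tuple(fcm_memberships(p, centroids, m)) for p in points]


# Toy setup (hypothetical): two centroids on the x-axis.
centroids = [(0.0, 0.0), (4.0, 0.0)]
u = fcm_memberships((1.0, 0.0), centroids)
augmented = augment([(1.0, 0.0), (2.0, 0.0)], centroids)
print([round(v, 3) for v in u])
```

Each sample gains one extra feature per cluster, so a classifier trained on the augmented data sees both the original measurements and a soft description of where the sample sits among the clusters.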
